{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "1b68bccb-a98b-48c2-889a-10e5c4822eeb",
   "metadata": {
    "user_expressions": []
   },
   "source": [
    "# How to use Awkward Arrays in C++ with cppyy"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f80c4c0d-f4da-4e1e-adb8-90f8482d23e8",
   "metadata": {
    "tags": [],
    "user_expressions": []
   },
   "source": [
    ":::{warning}\n",
    "\n",
    "Awkward Array can only work with `cppyy` 3.1 or later.\n",
    ":::\n",
    "\n",
    ":::{warning}\n",
    "`cppyy` must be in a different venv or conda environment from ROOT, if you have installed ROOT, because the two packages define modules with conflicting names.\n",
    ":::\n",
    "\n",
    "The [cppyy](https://cppyy.readthedocs.io/en/latest/index.html) is an automatic, run-time, Python-C++ bindings generator, for calling C++ from Python and Python from C++. `cppyy` is based on the C++ interpreter `Cling`.\n",
    "\n",
    "`cppyy` can understand Awkward Arrays. When an {class}`ak.Array` type is passed to a C++ function defined in `cppyy`, a `__cast_cpp__` magic function of an {class}`ak.Array` is invoked. The function dynamically generates a C++ type and a view of the array, if it has not been generated yet.\n",
    "\n",
    "The view is a lightweight 40-byte C++ object dynamically allocated on the stack. This view is generated on demand - and only once per Awkward Array, the data are not copied."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "48778e8a",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'2.1.1'"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import awkward as ak\n",
    "ak.__version__"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "dd32d294",
   "metadata": {},
   "outputs": [],
   "source": [
    "import awkward._connect.cling"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "a87d01b8",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'3.0.0'"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import cppyy\n",
    "cppyy.__version__"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "39b1877e",
   "metadata": {},
   "source": [
    "Let's define an Awkward Array as a list of records:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "97f90216",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre>[[{x: 1, y: [1.1]}, {x: 2, y: [2.2, 0.2]}],\n",
       " [],\n",
       " [{x: 3, y: [3, 0.3, 3.3]}]]\n",
       "-------------------------------------------\n",
       "type: 3 * var * {\n",
       "    x: int64,\n",
       "    y: var * float64\n",
       "}</pre>"
      ],
      "text/plain": [
       "<Array [[{x: 1, y: [1.1]}, {...}], ...] type='3 * var * {x: int64, y: var *...'>"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "array = ak.Array(\n",
    "    [\n",
    "        [{\"x\": 1, \"y\": [1.1]}, {\"x\": 2, \"y\": [2.2, 0.2]}],\n",
    "        [],\n",
    "        [{\"x\": 3, \"y\": [3.0, 0.3, 3.3]}],\n",
    "    ]\n",
    ")\n",
    "array"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b5cc3b3b-8426-4def-96d5-be1314847bc4",
   "metadata": {},
   "source": [
    "This example shows a templated C++ function that takes an Awkward Array and iterates over the list of records:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "d4294ad6",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "True"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "source_code = \"\"\"\n",
    "template<typename T>\n",
    "double go_fast_cpp(T& awkward_array) {\n",
    "    double out = 0.0;\n",
    "\n",
    "    for (auto list : awkward_array) {\n",
    "        for (auto record : list) {\n",
    "            for (auto item : record.y()) {\n",
    "                out += item;\n",
    "            }\n",
    "        }\n",
    "    }\n",
    "\n",
    "    return out;\n",
    "}\n",
    "\"\"\"\n",
    "\n",
    "cppyy.cppdef(source_code)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "acecac23-fe0b-485c-8eaf-11c926124217",
   "metadata": {},
   "source": [
    "The C++ type of an Awkward Array is a made-up type;\n",
    "`awkward::ListArray_hyKwTH3lk1A`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "03ab8a70",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'awkward::ListArray_hyKwTH3lk1A'"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "array.cpp_type"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "240fcb74",
   "metadata": {},
   "source": [
    "Awkward Arrays are dynamically typed, so in a C++ context, the type name is hashed. In practice, there is no need to know the type. The C++ code should use a placeholder type specifier `auto`. The type of the variable that is being declared will be automatically deduced from its initializer.\n",
    "\n",
    "In a Python contexts, when a templated function requires a C++ type as a Python string, it can use the `ak.Array.cpp_type` property:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "0bea9b7d",
   "metadata": {},
   "outputs": [],
   "source": [
    "out = cppyy.gbl.go_fast_cpp[array.cpp_type](array)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "f0fb71ec",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "3.93 µs ± 135 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)\n"
     ]
    }
   ],
   "source": [
    "%%timeit\n",
    "\n",
    "out = cppyy.gbl.go_fast_cpp[array.cpp_type](array)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "11a7ecec",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "184 µs ± 15.9 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)\n"
     ]
    }
   ],
   "source": [
    "%%timeit\n",
    "\n",
    "ak.sum(array[\"y\"])"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a1d1a195-9b14-4c48-b8af-f4b01e120537",
   "metadata": {},
   "source": [
    "But the result is the same."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "1d23590b",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "assert out == ak.sum(array[\"y\"])"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.9"
  },
  "widgets": {
   "application/vnd.jupyter.widget-state+json": {
    "state": {},
    "version_major": 2,
    "version_minor": 0
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
